Evidential techniques in parallel database mining

Authors

  • Sarabjot S. Anand
  • David A. Bell
  • John G. Hughes
  • Mary Shapcott
Abstract

The realisation that stored masses of data contain more information than is obvious has led to great interest in the field of Database Mining over the last couple of years. While hardware for storing these masses of data has advanced rapidly with demand, as have software methodologies for storing, manipulating and reporting on the data, little progress has been made in methods for automatically analysing the data and extracting the knowledge stored implicitly within it. This process of “reading between the lines” is called Database Mining (DM). DM is clearly a difficult process, because the methods required to discover knowledge are complex and data intensive. In this paper we explain how high performance computing can play a vital role in DM and discuss the implementation of a specific algorithm, STRIP (Strong Rule Induction in Parallel) [ANAN94b, ANAN95], developed by the authors for the discovery of strong, or “almost exact”, rules from databases. STRIP is the first algorithm to be implemented within a parallel framework for Database Mining based on Evidence Theory (EDM) [ANAN94a], also developed by the authors. In an earlier paper we discussed the different levels of parallelism within STRIP and demonstrated them using a transputer network [ANAN95]. In this paper we discuss the implementation of STRIP on a cluster of Silicon Graphics workstations connected by an ATM network.


Related resources

A New Algorithm for Mining Frequent Itemsets from Evidential Databases

The association rule mining (ARM) problem has been extensively tackled in the context of perfect data. However, real applications have shown that data are often imperfect (incomplete and/or uncertain), which creates a need for ARM algorithms that process imperfect databases. In this paper we propose a new algorithm for mining frequent itemsets from evidential databases. We introduce a new structure ca...


Mining Frequent Itemsets in Evidential Database

Mining frequent patterns is widely used to discover knowledge from a database. It was originally applied to the Market Basket Analysis (MBA) problem, which involves Boolean databases. In those databases, only the presence of an article (item) in a transaction is recorded. However, in real-world applications, the gathered information generally suffers from imperfections. In fact, a piece of infor...


Data Mining: a Database Perspective

Data mining on large databases has been a major concern in the research community, due to the difficulty of analyzing huge volumes of data using only traditional OLAP tools. This sort of process requires a lot of computational power, memory and disk I/O, which can only be provided by parallel computers. We present a discussion of how database technology can be integrated with data mining techniques. Fi...


Classification with Evidential Associative Rules

Mining databases provides valuable information such as frequent patterns and, especially, associative rules. Associative rules have various applications, chiefly data classification. The appearance of new and complex kinds of data such as evidential databases has created a need for new methods to extract pertinent rules. In this paper, we intend to propose a new approach for pertinent rul...



Publication date: 1995